Linking Dutch Wikipedia Categories to EuroWordNet
Abstract
Wikipedia provides category information for a large number of named entities, but the category structure of Wikipedia is associative and not always suitable for linguistic applications. For this reason, a merger of Wikipedia and WordNet has been proposed. In this paper, we address the word sense disambiguation problem that needs to be solved when linking Dutch Wikipedia categories to polysemous Dutch EuroWordNet literals. We show that a method based on automatically acquired predominant word senses outperforms a method based on word overlap between Wikipedia supercategories and WordNet hypernyms. We compare the coverage of the resulting categorization with that of a corpus-based system that uses automatically acquired category labels.
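As a rough illustration of the word-overlap baseline mentioned in the abstract, the sketch below picks the sense of a polysemous literal whose hypernyms share the most words with the Wikipedia supercategories. It is not the authors' implementation: the function name `pick_sense_by_overlap`, the sense identifiers, and the toy Dutch data are all hypothetical, and a real system would read supercategories from Wikipedia and hypernyms from EuroWordNet.

```python
def pick_sense_by_overlap(supercategories, senses):
    """Return the sense id whose hypernyms overlap most with the supercategories.

    supercategories: list of category labels (hypothetical input format).
    senses: dict mapping a sense id to a list of hypernym labels (hypothetical).
    Returns None when no sense shares any word with the supercategories.
    """
    # Collect the lowercased words that occur in the supercategory labels.
    cat_words = {w.lower() for label in supercategories for w in label.split()}
    best_id, best_score = None, 0
    for sense_id, hypernyms in senses.items():
        hyp_words = {w.lower() for label in hypernyms for w in label.split()}
        score = len(cat_words & hyp_words)  # number of shared words
        if score > best_score:
            best_id, best_score = sense_id, score
    return best_id


# Toy example (invented data): the Dutch literal "bank" as furniture or institution.
toy_senses = {
    "bank_1": ["meubel", "zitmeubel"],
    "bank_2": ["financiële instelling", "instelling"],
}
chosen = pick_sense_by_overlap(["Financiële instelling", "Bedrijf"], toy_senses)
print(chosen)
```

The predominant-sense method that the abstract reports as stronger would instead assign each literal one automatically acquired default sense, regardless of the supercategories of the individual category.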
Similar resources
Cross-lingual Dutch to English alignment using EuroWordNet and Dutch Wikipedia
This paper describes a system for linking the thesaurus of the Netherlands Institute for Sound and Vision to English WordNet and DBpedia. We used EuroWordNet, a multilingual wordnet, and Dutch Wikipedia as intermediaries for the two alignments. EuroWordNet covers most of the subject terms in the thesaurus, but the organization of the cross-lingual links makes selection of the most appropriate E...
Cross-lingual Ontology Alignment using EuroWordNet and Wikipedia
This paper describes a system for linking the thesaurus of the Netherlands Institute for Sound and Vision to English WordNet and DBpedia. The thesaurus contains subject (concept) terms, and names of persons, locations, and miscellaneous names. We used EuroWordNet, a multilingual wordnet, and Dutch Wikipedia as intermediaries for the two alignments. EuroWordNet covers most of the subject terms in...
When to Cross Over? Cross-Language Linking Using Wikipedia for VideoCLEF 2009
We describe Dublin City University (DCU)’s participation in the VideoCLEF 2009 Linking Task. Two approaches were implemented using the Lemur information retrieval toolkit. Both approaches first extracted a search query from the transcriptions of the Dutch TV broadcasts. One method first performed search on a Dutch Wikipedia archive, then followed links to corresponding pages in the English Wiki...
Crosslingual Countability Classification with EuroWordNet
We examine the hypothesis that noun countability is consistent for a given word semantics by way of a series of experiments involving EuroWordNet and the English and Dutch languages. The basic method involves determining a default set of countabilities for each EuroWordNet synset based on countability-mapped words in that synset, and testing the match between these countabilities and those of h...
Improving the Precision of Synset Links Between Cornetto and Princeton WordNet
Knowledge-based multilingual language processing benefits from having access to correctly established relations between semantic lexicons, such as the links between different WordNets. WordNet linking is a process that can be sped up by the use of computational techniques. Manual evaluations of the partly automatically established synonym set (synset) relations between Dutch and English in Corn...